This is an update to my previous take on Bayesian AB testing. Specifically, I looked at taking advantage of prior knowledge that the change in conversion rate shouldn't be large -- maybe because you've run dozens of similar tests and you're making only a small modification to the website. Previously I described a sort of multi-level beta-binomial model to account for that prior knowledge, but it turns out that regularized logistic regression is simpler and gives pretty much the same answer.
We start with a baseline conversion rate, parameterized by its log-odds $\mu$, from an uninformative prior*. Then, based on your prior knowledge of how much the conversion rates are likely to differ from one another, select a spread term $\sigma$. Next, the two conversion rates $p_A$ and $p_B$ are drawn from $\text{logistic} \left[ \mathcal{N}(\mu, \sigma) \right]$. Finally, we observe visits to the website, which are Bernoulli distributed with rates $p_A$ and $p_B$.
* It's simple enough to use prior information for this too, but it turns out to have very little effect on the answer.
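To make the generative story concrete, here's a quick simulation of one run of this process (a sketch; the ~10% baseline, $\sigma = 0.09$, and 3,000 visits per variant are made-up illustrative values):
import numpy as np
rng = np.random.default_rng(0)
mu = np.log(0.1 / 0.9)  # made-up ~10% baseline, on the log-odds scale
sigma = 0.09            # made-up prior spread around the baseline
# draw the two conversion rates from logistic[N(mu, sigma)]
pA_sim, pB_sim = 1 / (1 + np.exp(-rng.normal(mu, sigma, size=2)))
# observe 3000 Bernoulli-distributed visits per variant
visitsA = rng.random(3000) < pA_sim
visitsB = rng.random(3000) < pB_sim
pA_sim, pB_sim, visitsA.mean(), visitsB.mean()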
Why is this just regularized logistic regression? The model has a single categorical predictor (which variant the visitor saw), the coefficients live on the log-odds scale, and the $\mathcal{N}(\mu, \sigma)$ prior on the coefficients is exactly an L2 (ridge) penalty: the maximum a posteriori estimate minimizes the usual binomial negative log-likelihood plus $\frac{1}{2\sigma^2} \sum_i (\beta_i - \mu)^2$. A smaller $\sigma$ means stronger regularization, shrinking $p_A$ and $p_B$ toward a common baseline.
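To see the equivalence in code, here's a minimal sketch of the MAP fit using scipy (the function name is my own; $\mu$ is optimized over as well since it gets a flat prior, and the counts are the ones used later in this notebook):
import numpy as np
from scipy.optimize import minimize

def neg_log_posterior(params, successes, trials, sigma):
    mu, coef = params[0], params[1:]
    # binomial log-likelihood, written on the log-odds scale
    loglik = np.sum(successes * coef - trials * np.log1p(np.exp(coef)))
    # the N(mu, sigma) prior on the coefficients is the ridge penalty
    logprior = -np.sum((coef - mu) ** 2) / (2 * sigma**2)
    return -(loglik + logprior)

res = minimize(neg_log_posterior, x0=np.zeros(3),
               args=(np.array([300, 350]), np.array([3300, 3350]), 0.09))
p_map = 1 / (1 + np.exp(-res.x[1:]))  # MAP estimates of pA and pB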
You can solve for the maximum a posteriori estimates of $p_A$ and $p_B$ directly (that's the minimization above), but it's nice to have a full posterior distribution to work with. For example, you might want to report various credible intervals of interest. To get the distributions, I'll use Stan.
In [102]:
import pandas as pd
import matplotlib.pyplot as plt
from numpy import exp
from pystan import StanModel
# pystan gives some annoying warnings when sampling
import warnings
warnings.filterwarnings('ignore')
logistic = lambda x: 1/(1+exp(-x))
The Stan code for the usual 'naive' beta-binomial model:
In [115]:
naive_model_code = '''
data {
  int<lower=0> successes[2];
  int<lower=0> trials[2];
  real<lower=0> alpha0;
  real<lower=0> beta0;
}
parameters {
  vector<lower=0, upper=1>[2] p;  // conversion rates for A and B
}
model {
  p ~ beta(alpha0, beta0);
  successes ~ binomial(trials, p);
}
'''
naive_model = StanModel(model_code=naive_model_code)
The Stan code for the improved beta-binomial model from the previous notebook:
In [103]:
bb_model_code = '''
data {
  int<lower=0> successes[2];
  int<lower=0> trials[2];
  real<lower=0> alpha0;
  real<lower=0> beta0;
  real<lower=0> tau;  // prior "sample size" alpha1 + beta1
}
parameters {
  real<lower=0, upper=1> mu;  // shared baseline conversion rate
  vector<lower=0, upper=1>[2] p;
}
transformed parameters {
  real<lower=0> alpha1;
  real<lower=0> beta1;
  alpha1 = tau * mu;
  beta1 = tau * (1 - mu);
}
model {
  mu ~ beta(alpha0, beta0);
  p ~ beta(alpha1, beta1);
  successes ~ binomial(trials, p);
}
'''
bb_model = StanModel(model_code=bb_model_code)
The Stan code for the new logistic regression model:
In [104]:
lr_model_code = '''
data {
  int<lower=0> successes[2];
  int<lower=0> trials[2];
  real<lower=0> sigma;
}
parameters {
  real mu;         // baseline log-odds; no prior statement = flat prior
  vector[2] coef;  // per-variant log-odds
}
model {
  coef ~ normal(mu, sigma);
  successes ~ binomial_logit(trials, coef);
}
'''
lr_model = StanModel(model_code=lr_model_code)
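Incidentally, $\sigma$ here plays the same role as $\tau$ in the beta-binomial model: both control the prior spread of the rates around the shared baseline. For a baseline near 10%, the values used below ($\tau = 1500$ and $\sigma = 0.09$) imply roughly the same spread, per this rough delta-method calculation:
mu, tau, sigma = 0.1, 1500, 0.09
# Beta(tau*mu, tau*(1-mu)) has standard deviation sqrt(mu*(1-mu)/(tau+1))
sd_bb = (mu * (1 - mu) / (tau + 1)) ** 0.5
# delta method: logistic[N(logit(mu), sigma)] has sd of about sigma*mu*(1-mu)
sd_lr = sigma * mu * (1 - mu)
sd_bb, sd_lr  # roughly 0.0077 vs 0.0081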
In [108]:
successesA = 300
successesB = 350
failuresA = 3000
failuresB = 3000
successes = [successesA, successesB]
trials = [successesA+failuresA, successesB+failuresB]
pA, pB = successesA/(successesA+failuresA), successesB/(successesB+failuresB)
pA, pB, (pB-pA)/pA
Out[108]:
(0.09090909090909091, 0.10447761194029851, 0.14925373134328357)
In [122]:
stan_data = dict(successes=successes, trials=trials, alpha0=1, beta0=1)
naive_fit1 = naive_model.sampling(data=stan_data)
naive_fit1
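As a sanity check, this flat-prior model is conjugate, so the exact posterior for each rate is $\text{Beta}(\alpha_0 + \text{successes}, \beta_0 + \text{failures})$, and we can compare Stan's draws against direct ones:
import numpy as np
rng = np.random.default_rng(0)
exact_pA = rng.beta(1 + successesA, 1 + failuresA, size=4000)
exact_pB = rng.beta(1 + successesB, 1 + failuresB, size=4000)
np.percentile((exact_pB - exact_pA) / exact_pA, [2.5, 50, 97.5])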
In [123]:
stan_data = dict(successes=successes, trials=trials, alpha0=30, beta0=270)
naive_fit2 = naive_model.sampling(data=stan_data)
naive_fit2
In [112]:
stan_data = dict(successes=successes, trials=trials, alpha0=3, beta0=27, tau=1500)
bb_fit = bb_model.sampling(data=stan_data)
bb_fit
In [110]:
stan_data = dict(successes=successes, trials=trials, sigma=.09)
lr_fit = lr_model.sampling(data=stan_data)
lr_fit
Combine all the estimates into one dataframe for plotting:
In [124]:
p = pd.DataFrame(naive_fit1.extract()['p'], columns=['A', 'B'])
naive_data1 = pd.DataFrame(dict(value=(p.B-p.A)/p.A))
naive_data1['variable'] = 'naive flat prior'
p = pd.DataFrame(naive_fit2.extract()['p'], columns=['A', 'B'])
naive_data2 = pd.DataFrame(dict(value=(p.B-p.A)/p.A))
naive_data2['variable'] = 'naive informative prior'
p = pd.DataFrame(bb_fit.extract()['p'], columns=['A', 'B'])
bb_data = pd.DataFrame(dict(value=(p.B-p.A)/p.A))
bb_data['variable'] = 'improved beta-binomial'
p = pd.DataFrame(logistic(lr_fit.extract()['coef']), columns=['A', 'B'])
lr_data = pd.DataFrame(dict(value=(p.B-p.A)/p.A))
lr_data['variable'] = 'logistic regression'
data = pd.concat([naive_data1, naive_data2, bb_data, lr_data])
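With the samples collected, credible intervals are one-liners; for example, a 95% interval on the relative lift under each model:
data.groupby('variable').value.quantile([.025, .975])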
In [125]:
data.groupby('variable').value.plot(kind='kde')
plt.legend()
plt.xlim(-.1, .4);
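And since these are full posterior distributions, a quantity like the probability that B actually beats A falls out just as easily:
data.groupby('variable').value.apply(lambda v: (v > 0).mean())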